epsilon-Kernel Coresets for Stochastic Points

Authors

  • Lingxiao Huang
  • Jian Li
  • Jeff M. Phillips
  • Haitao Wang
Abstract

With the dramatic growth in the number of application domains that generate probabilistic, noisy and uncertain data, there has been an increasing interest in designing algorithms for geometric or combinatorial optimization problems over such data. In this paper, we initiate the study of constructing ε-kernel coresets for uncertain points. We consider uncertainty in the existential model, where each point's location is fixed but the point only occurs with a certain probability, and in the locational model, where each point has a probability distribution describing its location. An ε-kernel coreset approximates the width of a point set in any direction. We consider approximating the expected width (an ε-exp-kernel), as well as the probability distribution on the width (an (ε, τ)-quant-kernel), for any direction. We show that there exists a set of O(ε^{-(d-1)/2}) deterministic points which approximates the expected width under the existential and locational models, and we provide efficient algorithms for constructing such coresets. We show, however, that it is not always possible to find a subset of the original uncertain points which provides such an approximation. Nevertheless, if the existential probability of each point is lower bounded by a constant, an ε-exp-kernel is still possible. We also provide efficient algorithms for constructing an (ε, τ)-quant-kernel coreset in nearly linear time. Our techniques utilize or connect to several important notions in probability and geometry, such as Kolmogorov distances, VC uniform convergence and Tukey depth, and may be useful in other geometric optimization problems in stochastic settings. Finally, combining with known techniques, we show a few applications to approximating the extent of uncertain functions, maintaining extent measures for stochastic moving points and some shape fitting problems under uncertainty.
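To make the quantities in the abstract concrete, the following minimal Python sketch computes the width of a point set in a direction u and estimates the expected width under the existential model by plain Monte Carlo sampling. It is an illustration only, not the paper's coreset construction. The function names, the sampling approach, and the convention that an empty or single-point realization has width 0 are assumptions made for this example.

```python
import random


def directional_width(points, u):
    """Width of a point set along direction u: max <p, u> - min <p, u>.

    Assumption for this sketch: an empty set has width 0.
    """
    if not points:
        return 0.0
    projections = [sum(pi * ui for pi, ui in zip(p, u)) for p in points]
    return max(projections) - min(projections)


def expected_width_mc(points, probs, u, trials=20000, seed=0):
    """Monte Carlo estimate of the expected width in the existential model.

    Each point points[i] has a fixed location and is present independently
    with probability probs[i]. This sampler is a hypothetical helper for
    illustration, not an algorithm taken from the paper.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Draw one realization: keep each point with its own probability.
        realized = [p for p, q in zip(points, probs) if rng.random() < q]
        total += directional_width(realized, u)
    return total / trials


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0)]
    # Both points must be present for the width along the x-axis to be 1,
    # so the exact expected width is 0.5 * 0.5 * 1 = 0.25.
    print(expected_width_mc(pts, [0.5, 0.5], (1.0, 0.0)))
```

In this toy instance the estimate can be checked against the exact value 0.25. An ε-exp-kernel, as described in the abstract, would be a set of deterministic points whose `directional_width` approximates this expectation in every direction u.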

Similar resources

$ε$-Kernel Coresets for Stochastic Points

With the dramatic growth in the number of application domains that generate probabilistic, noisy and uncertain data, there has been an increasing interest in designing algorithms for geometric or combinatorial optimization problems over such data. In this paper, we initiate the study of constructing ε-kernel coresets for uncertain points. We consider uncertainty in the existential model where e...

eps-Kernel Coresets for Stochastic Points

We study the problem of constructing ε-kernel coresets for uncertain points. We consider uncertainty under the existential model where each point’s location is fixed but only occurs with a certain probability, and the locational model where each point has a probability distribution describing its location. An ε-kernel coreset approximates the width of a point set...

Near-Optimal Coresets of Kernel Density Estimates

We construct near-optimal coresets for kernel density estimates for points in R^d when the kernel is positive definite. Specifically we show a polynomial time construction for a coreset of size O(√(d log(1/ε))/ε), and we show a near-matching lower bound of size Ω(√d/ε). The upper bound is a polynomial in 1/ε improvement when d ∈ [3, 1/ε^2) (for all kernels except the Gaussian kernel which had a ...

Visualization of Big Spatial Data using Coresets for Kernel Density Estimates

The size of large, geo-located datasets has reached scales where visualization of all data points is inefficient. Random sampling is a method to reduce the size of a dataset, yet it can introduce unwanted errors. We describe a method for subsampling of spatial data suitable for creating kernel density estimates from very large data and demonstrate that it results in less error than random sampl...

On Coreset Constructions for the Fuzzy $K$-Means Problem

In this paper, we present coreset constructions for the fuzzy K-means problem. First, we show that one can construct weak coresets for fuzzy K-means. Second, we show that there are coresets for fuzzy K-means with respect to balanced fuzzy K-means solutions. Third, we use these coresets to develop a randomized approximation algorithm whose runtime is polynomial in the number of the given points...

Publication date: 2016